Efficient Performance Prediction for Large-Scale, Data-Intensive Applications

Authors

  • Tahsin M. Kurç
  • Mustafa Uysal
  • Hyeonsang Eom
  • Jeffrey K. Hollingsworth
  • Joel H. Saltz
  • Alan Sussman
Abstract

This paper presents a simulation-based performance prediction framework for large-scale, data-intensive applications on large-scale machines. The framework consists of two components: application emulators and a suite of simulators. Application emulators provide a parameterized model of data access and computation patterns of the applications and enable changing critical application components (input data partitioning, data declustering, processing structure, etc.). The suite of simulators executes quickly on a high performance workstation to allow performance prediction of large-scale parallel machine configurations. The key to efficient simulation of very large configurations is to elide the majority of low-level hardware events while preserving data dependencies and distributions. The authors evaluate their performance prediction tool using a set of three data-intensive applications.

Similar resources

A Performance Prediction Framework for Data Intensive Applications on Large Scale Parallel Machines

This paper presents a simulation-based performance prediction framework for large scale data-intensive applications on large scale machines. Our framework consists of two components: application emulators and a suite of simulators. Application emulators provide a parameterized model of data access and computation patterns of the applications and enable changing of critical application component...

E2DR: Energy Efficient Data Replication in Data Grid

Data grids are an important branch of grid computing which provide mechanisms for the management of large volumes of distributed data. Energy efficiency has recently emerged as a hot topic in large distributed systems. The development of computing systems is traditionally focused on performance improvements driven by the demand of client applications in scientific and business domai...

Performance Prediction for Data Intensive Applications on Large Scale Parallel Systems

This paper presents a new interactive performance estimation tool – PetaSIM for large scale parallel systems. Our main approach is to divide the difficult performance estimation problem into three domains: application, software and hardware, to extract the system specifications and provide tools for the interactive changes of the system parameters over the Internet. Computers, networks and appl...

Performance Prediction for Large Scale Parallel Systems

In both the design of parallel computer systems and the development of applications, it is very important to have good performance prediction tools. This paper describes a new approach, PetaSIM, which is designed for the rapid prototyping stage of machine or application design. Computers, networks and applications are described as objects in a Java IDL (Interface Definition Language) with speci...

Scalable Alignment Kernels via Space-Efficient Feature Maps

String kernels are attractive data analysis tools for analyzing string data. Among them, alignment kernels are known for their high prediction accuracies in string classifications when tested in combination with SVMs in various applications. However, alignment kernels have a crucial drawback in that they scale poorly due to their quadratic computation complexity in the number of input strings, ...

Journal:
  • IJHPCA

Volume 14

Publication date: 2000